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The main focus of this work is on developing models for the ac- 
tivity profile of a terrorist group, detecting sudden spurts and down- 

■ falls in this profile, and in general, tracking it over a period of time. 
I Toward this goal, a d-state hidden Markov model (HMM) that cap- 
tures the latent states underlying the dynamics of the group and 
thus its activity profile is developed. The simplest setting of d = 2 
corresponds to the case where the dynamics are coarsely quantized 

■ as Active and Inactive, respectively. Two strategies for spurt detec- 
fSJ ' tion and tracking are developed here: a model-independent strategy 

that uses the exponential weighted moving-average (EWMA) filter 
to track the strength of the group as measured by the number of 
attacks perpetrated by it, and a state estimation strategy that ex- 
ploits the underlying HMM structure. The EWMA strategy is robust 
to modeling uncertainties and errors, and tracks persistent changes 
(changes that last for a sufficiently long duration) in the strength of 
C/3 ■ the group. On the other hand, the state estimation strategy tracks 

even non-persistent changes that last only for a short duration at the 
cost of learning the underlying model. Case-studies with real terror- 

■ ism data from open-source databases are provided to illustrate the 
' performance of the two strategies. 

1. Introduction. The increasing radicalization of divergent interest groups with dif- 
I ferent grievance profiles has led to an increased footprint of terrorist activity in many 

■ countries of the world. This threat has in turn led to an increased spending on counter- 
terrorism efforts that has shackled many of these economies (Mueller and Stewart, 2011; 
Belasco, 2006). In the light of these developments, it is critically important to track the 
activity of terrorist groups so that effective and appropriate counter-terrorism measures 
can be quickly undertaken to restore order and stability. In particular, detecting sudden 

I spurts and downfalls in the activity profile of terrorist groups can help in understanding 

the dynamics of the terrorist group, thereby allowing the establishment to predict and even 
game the response to certain policy measures. 

The basic element toward these goals is the collation of data on terrorist activities perpe- 
trated by a group of interest. In this regard, it is well- understood that collecting terrorism 
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data relative to more common forms of crime is a challenging task (LaFree and Dugan, 
2009). Despite these challenges, many databases such as ITERATE, the Global Terrorism 
Database (GTD) (LaPree and Dugan, 2007), the RAND Database on Worldwide Terrorism 
Incidents (RDWTI), etc., have made data on terrorism available in an open-source setting. 
However, even the best efforts on collecting data cannot overcome issues of temporal ambi- 
guity, missing data, and attributional ambiguity that leads to mislabeled data. In addition, 
the very nature of terrorism makes it a rare occurrence from the viewpoint of model learning 
and inferencing. Thus, in addition to fitting observed terrorist data sufficiently accurately, 
an ideal model for the activity profile should: i) be motivated by a small set of hypotheses 
and remain robust to a certain looseness in the hypothesis claims, ii) be described by few 
parameters, and iii) be robust to missing or mislabeled data. 

Prior work on modeling the activity profile of terrorist groups falls under one of the follow- 
ing three categories. Enders and Sandler (1993, 2000) use classical time-series analysis tech- 
niques to study both the short-run as well as the long-run spurt /downfall in world terrorist 
activity over the period from 1970 to 1996. On the other hand, LaFree, Morris and Dugan 
(2010) and Dugan, LaFree and Piquero (2005) adopt group-based trajectory analysis tech- 
niques (Cox proportional hazards model or zero-inflated Poisson model) to identify regional 
terrorism trends with similar developmental paths. The common theme that ties both these 
works is that the optimal number of underlying latent groups and the associated parameters 
that best fit the data remain variable. Further, the parameters are chosen to optimize a 
metric such as the Akaike Information Criterion (AIC), the Bayesian Information Criterion 
(BIC), etc., or via logistic regression methods. While a rich class of models ensures that 
reasonably acceptable model-fits can be arrived at, this is often at the cost of a large num- 
ber of model parameters and with an ad hoc class that captures complicated dependency 
relationships between the endogenous and exogenous variables. 

In addition, both sets of works take a contagion theoretic viewpoint (Midlarsky, 1978; 
Midlarsky, Crenshaw and Yoshida, 1980) that the current activity of the group is dependent 
on the past history of the group, which accounts for clustering effects in the activity profile of 
the group. This contagion viewpoint is put on a theoretical foundation in Porter and White 
(2012) where an easily decomposable two component self-exciting hurdle model (Hawkes, 
1971; Cox and Isham, 1980) for the activity profile is introduced. Self-exciting models have 
become increasingly popular in as diverse fields as seismology (Ogata, 1988, 1998), gang 
behavior modeling (Mohler et al., 2011), insurgency activity that mirrors inter-gang vio- 
lence (Lewis et al., 2011), epidemiology (Kim, 2011), etc. In its simplest form, the hurdle 
component of the model creates data sparsity by ensuring a pre-specified density of zero 
counts, while the self-exciting component induces clustering of data. However, the very 
nature of the self-exciting component implies that parameter estimation and inferencing 
algorithms are model-sensitive and have to be pursued on a case-by-case basis for different 
terrorist groups (and are hence not well- understood). 

Contributions: In contrast to the self-exciting model, we hypothesize here that the cause of 
an increased/decreased frequency in attacks is some change in the group's hidden states that 
reflect the dynamics of its evolution rather than the fact that the group has already carried 
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out attacks on either the previous day/week/month. In other words, while the self-exciting 
component of the model is best suited for earthquake activity (Ogata, 1988, 1998) where af- 
tershocks are a direct consequence of current activity, or inter-gang violence (Mohler et al., 
2011; Lewis et al., 2011) where attacks are retributive and often action-reaction conse- 
quences, activity modeling of terrorist groups is better motivated via a hidden Markov 
model (HMM) framework (Rabiner, 1989). We propose a d-state HMM for the activity 
profile where each of the d possible values corresponds to a certain distinct level in the 
underlying attributes. The simplest non-trivial setting of d = 2 with Active and Inactive 
states is shown to be a good model that captures most of the facets of real terrorism data. 

We then develop a changepoint problem formulation to quickly identify a sudden spurt 
(or a sudden downfall) in the activity profile of the group. To address this problem, we 
propose two detection strategies. The first strategy is a stopping-time based on the Ex- 
ponential Weighted Moving-Average (EWMA) filter that tracks the quantities of interest. 
This strategy does not require knowledge of the underlying model and is hence useful in 
settings where model learning is difficult or where model uncertainty is high. However, as 
a price for this luxury, only persistent changes (changes that last for a sufficiently long du- 
ration) can be detected with this strategy. The second strategy requires an estimate of the 
model parameters (that are learned over a training-period) to estimate the most probable 
state sequence using the Viterbi algorithm. We show that this strategy not only detects 
non-persistent changes, but also allows the identification of key geopolitical undercurrents 
(or events) that lead to sudden (major or minor) spurts/downfalls in a group's activity. 

The advantages of the proposed framework are many. The proposed d-state HMM is built 
on a small set of easily motivated hypotheses and is parsimoniously described by d[d + 1) 
model parameters. While observed data rarity can be explained by state transitions, clus- 
tering of attacks can be attributed to different intensity profiles in the different states. In 
addition to resulting in a good model-fit with real terrorism data, the proposed framework 
offers two significant advantages over existing approaches: i) the HMM paradigm allows 
the use of a mature and computationally efficient toolkit for model learning and infer- 
encing (Rabiner, 1989), and ii) it offers a smooth trade-off between accurate modeling of 
terrorist activity and computational efficiency. 

Organization: This paper is organized as follows. Section 2 illustrates some of the key 
qualitative features that describe the activity profile of terrorist groups and explains how 
prior work has modeled these features. In Section 3, a HMM framework is proposed to 
capture the key facets of the activity profile. Section 4 develops two strategies for detecting 
spurts and downfalls in the activity profile: a model-independent approach based on the 
EWMA algorithm and a HMM approach based on the Viterbi algorithm. The efficacy of 
the proposed approaches is studied with real datasets in Section 5. In Section 6, some of 
the key limitations in existing work that motivated the HMM framework are revisited and 
it is shown that the proposed approach overcomes these limitations. Concluding remarks 
are drawn in Section 7. 

2. Preliminaries. 
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Time-window: 5 = 15 days 
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Fig 1. (a) Number of days of terrorist activity, and (h) Total number of attacks in a 5-day time-window as 
a function of time for FARC (5 — 15 days). The key geopolitical events in this period are also marked. 



2.1. Qualitative Features of Activity Profile. A typical example of the activity profile 
is presented in Fig. 1 where the number of days of terrorist activity in a (5 = 15 day 
time-window and the total number of attacks within the same time-window are plotted 
as a function of time. The focus of this example is Fuerzas Armadas Revolucionarias de 
Colombia (FARC), the subject of our focus in Section 5.1. The data for Fig. 1 is obtained 
from the RDWTI and corresponds to incidents over the nine-year period from 1998 to 
2006. From Fig. 1 and a careful study of the activity profile of many terrorist groups from 
similar databases, we highlight some of the important features of terrorism data that impact 
modeling: 

• Temporal Ambiguity: The exact instance (time) of occurrence of a terrorism inci- 
dent is hard to pinpoint in reality. This is because accounts of most terrorist events 
are from third-party sources. Thus the granularity of incident reportage (that is, the 
time-scale on which incidents are reported) that is most relevant is discrete, typically 
days. 

• Attributional Ambiguity: Further, in many of the databases, there exists an am- 
biguity in attributing a certain terrorism incident to a specific group when multiple 
groups contest on the same geographical turf. Some of this ambiguity can be resolved 
by attributing an incident to a specific group based on the attack signature (attack 
target type, operational details, strategies involved, etc.). However, this is an intensive 
manual exercise and there is necessarily a certain ambiguity in this resolution. 

• Data Sparsity: Despite the recent surge in media attention on trans-national terror- 
ist activities and insurgencies, terrorism incidents are "rare" even for some of the most 
active terrorist groups. For example, the data in Fig. 1 corresponds to 604 incidents 
over a nine-year period leading to an average of ~ 1.29 incidents per week. While a 
case can be made that these databases report only a subset of the true activity, the 
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fact that significant amount of resources have to be invested by the terrorist group 
for every new incident acts as a natural dampener toward more attacks. 

These three features make a strong case for simphstic models for the activity profile of 
terrorist groups that can be described with few parameters and are motivated by few 
hypotheses, in contrast to elaborate and expansive models of many parameters. Further, 
any model should be robust to small number of errors in terms of missing or mislabeled 
data. 

2.2. Prior Work. The typical observable of a terrorist group is its activity profile with 
the observations being random point occurrences corresponding to terrorism incidents per- 
petrated by the group over a given time-period of interest. The previous discussion showed 
why a model for terrorism cannot be developed in continuous-time. A point process model 
in discrete-time is thus the most general framework (Cox and Isham, 1980) to model the 
activity profile. 

Let the first and last day of the time-period of interest be denoted as Day 1 and Day 
M, respectively. Let Mj denote the number of terrorism incidents on the ith day of ob- 
servation, i = 1, • • • ,AA. Note that Mj can take values from the set {0, 1,2, •••} with 
Mi = corresponding to no terrorist activity on the ith day of observation. On the 
other hand, there could be multiple terrorism incidents corresponding to independent at- 
tacks/targets on a given day refiecting the high level of coordination between various sub- 
groups of the group. Let 'Hi denote the history of the group's activity till day i. That is, 
Tii = {Ml, • • • , Mi} , i = 1,2, ■ ■ ■ ,M with T-Lq = 0. The point process model is complete if 
P {Mi = r\7ii^i) is specified as a function of Tii-i for alH = 1, • • • , A/" and r = 0, 1, 2, • • • . 

We noted in Section 1 that prior work on models for the activity profile fall under one 
of three categories. In the time-series techniques pioneered by Enders and Sandler (2000), 
a non-linear trend component, a seasonality (cyclic) component, and a stationary compo- 
nent are fitted to the time-series data of worldwide terrorism incidents. In particular, the 
following model-fit is proposed for {Mj}: 



where Ag denotes the period corresponding to the qth quarter in the period of observation 
and {n, oq, • • • , an, /?, ui, (j), Var(/ig)} are constants to be optimized over an appropriately 
chosen parameter space. This modeling effort results in an eight parameter model (4 for 
the trend component, 3 for the seasonality component, and a variance parameter for the 
stationary component) which is then used to identify a rough 4 and 1/2-year cycle in 
terrorism events. Further, a non-linear trend and seasonality component ensures that trends 
in terrorism cannot be predicted thus explaining the observed boom and bust cycles in 
terrorist activity. 

In (Dugan, LaFree and Piquero, 2005), the hazard function Haz(t) of the time to the 



n 




j=0 
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next hijacking (denoted as the random variable T) is modeled as 

log(Haz(i)) = log(Hazo) + /3i • Policies + /32 • Major Purpose + /33 • Context, 

Haz(t) = lim , 

^ ^ At-^O At 

where Hazo is a baseline hazard function, and {/3j} are optimized from a certain model 
class. On the other hand, the hurdle model of Porter and White (2012) is described as 

r ^~iB,+SE,{H^-i)) ^ 

P {Mi = rlTii-i) = 1 r^l . l']^ _ ^-{B,+SE,{ni-i))^ ^ I 

where s £ (l,oo) is an appropriately chosen parameter of the Zipf^ distribution, C{s) = 
is the Riemann-zeta function, Bi is a baseline process, and SEi{-) is the self- 
exciting component given as 

j:j<i,Mj>0 

for another appropriate choice of decay function g{-) and influence parameters {aj}. Note 
that the Zipf distribution falls in the class of power-law distributions (see Supplementary 
Part A) and is also explored in (Clauset, Young and Gleditsch, 2007). Porter and White 
(2012) study model-fits from a class described by eight parameters and show that a four 
parameter model optimizes an AIC metric for terrorism data from Indonesia/Timor-Leste 
over the period from 1994 to 2007. This model is shown to accurately capture terrorism 
data (especially the extreme outliers such as days with 36, 11, and 10 attacks). 

While the above set of models reasonably accurately capture terrorism data, they are 
cumbersome with too many parameters to be estimated. Further, the choice of metrics such 
as AIC or BIC lead to the best parsimonious model-fit from a certain class of models for 
observed terrorism data. Nevertheless, it is more reasonable to assume that the cause of 
an increased frequency of attacks is due to some changes in the group's tactics (a hidden 
state), rather than the fact that the group has already carried out attacks previously. Thus, 
activity modeling of terrorist groups is better motivated via a HMM. 

3. Proposed Model for the Activity Profile. Toward this direction of building a 
HMM for the activity profile, we make the following simplifying hypotheses in this work: 

• Hypothesis 1: The activity profile of a terrorist group {Mj, i = 1, • • • , AA} depends 
only on certain states {Si) in the sense that the current activity is conditionally 
independent of the past activity/history of the group given the current state: 

P {M,\ni-i, S,) = ?{Mi\Si), i = l,2,-- - . 



^This distribution is sometimes referred to as discrete Pareto. 
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In other words, these states completely capture the facets from the past history of the 
group in determining the current state of activity. While attaching specific meanings 
to Si is not the focus of this paper, an example in this direction is the postulate 
by Cragin and Daly (2004) that the Intentions and the Capabilities of a group could 
serve as these states. 

• Hypothesis 2: It is clear that the dynamics of terrorism is well-understood if the 
underlying states {Si} are known. However, in reality, Si cannot be observed directly 
and we can only make indirect inferences about it by observing {Mj,j = 1, • • • 
To allow inferencing of Si, we propose a simple d-state model that captures the 
dynamics of the group's latent states over time. That is, G {0, 1, • • • , d — 1} with 
each distinct value corresponding to a different level in the underlying attributes of the 
group. Further, we penalize frequent state transitions in the terrorist group dynamics 
by constraining Si to be fixed over a block (time-window) of 6 days, where 6 is chosen 
appropriately based on the group dynamics. 
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Fig 2. A simple d = 2 state HMM for the activity of a terrorist group. 



A typical illustration of this framework with d = 2 is provided in Fig. 2 where the 
state over the nth time-window (A^, n = 1, 2, • • • , K) corresponding to A„ = {(n — 1)6 + 
1, • • • , n6} and K = \^~\ is given as 

= Sn, Sn e {0, 1}. 

In the Inactive state (s„ = 0), the underlying Mj form a low- "rate" point process, whereas in 
the Active state (s„ = 1), the Mi form a high- "rate" point process. Thus, a state transition 
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from Sn-i = to s„ = 1 indicates a spurt in the activity of the group, whereas an opposite 
change indicates a downfah in the activity. This evolution in the states of the group is 
modeled by a state transition probability matrix P = {Pij}, where 

p ^ [ 1 - Po Po 
[ qo 1 - go _ 

with 

Pij = P(Sn = j\Sn-l = i), ij e {0, 1}. 

In general, there exists a trade-off between accurate modeling of the group's latent attributes 
(larger d is better for this task) versus estimating more model (transition probability matrix) 
parameters (smaller d is better). While attention in the sequel will be restricted to the d = 2 
setting because of the implementation ease of the proposed approaches in this setting, these 
approaches can be easily extended to a general d-state setting. 

To model the observations in either state (that is, P{Mi\Si = j), j = 0,1), a geometric 
density can be motivated as follows. Consider a setting where the group orchestrates Mj 
attacks on the ith. day till the success of a certain short-term policy objective can be 
declared. Every additional attack contributes equally to the success of this objective and 
as long as the group's objective has not been met, attaining this objective in the future is 
not dependent on the past attacks. In other words, the group remains memoryless (or is 
oblivious) of its past activity and continues to attack with the same pattern as long as its 
objective remains unmet. A slight modification in the group dynamics that assumes that a 
group resistance/hurdle needs to be overcome before this modus operandi kicks in leads to 
a hurdle-based geometric model for {Mi}. 

While these strategies may best describe a certain group, other groups could adopt dif- 
ferent strategies. In fact, strategies could also change with time. Thus, a certain model 
could make more sense for a given terrorist group than other models. In our subsequent 
study, we consider the following one-parameter models with support on the non-negative 
integers: Poisson, shifted Zipf, and geometric. We also consider the following two-parameter 
models: Polya^, (non-self-exciting) hurdle-based Zipf, and hurdle-based geometric. Of these 
six models, the geometric and the hurdle-based geometric allow simple recursions for esti- 
mates of model parameter(s) via the Baum- Welch algorithm, while the shifted Zipf and the 
hurdle-based Zipf distributions allow heavy tails (see Supplementary Part A for details). 
Further details on these models such as the associated probability density function, log- 
likelihood function, Maximum Likelihood (ML) estimate of the model parameters, and a 
formula for the AIC are also provided in Supplementary Part A. The study of fits of these 
models to specific terrorist groups is undertaken in Section 6.2. 

4. Detecting Spurts and Downfalls in Activity Profile. We are interested in 
solving the non-causal inference problem: 

= arg max P (Sj = s„, i G A„|{M,-, j = 1, • • • , A/"}) . 

n=l,-,K su-,SK&{0,l}'^ 
^Also referred to as negative binomial with positive real parameter. 
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In particular, we are interested in identifying state transitions that correspond to either a 
spurt or a downfah in activity. For this, note that while the parameters associated with the 
model proposed in Section 3 can be learned from terrorism data, there is often an interest 
in approaches that are independent of these parameters, and hence not sensitive to the 
parameter estimation algorithms or the length of the training-period. Thus, we propose 
two approaches to address (4.1): the first strategy is independent of the model parameters, 
whereas the second strategy exploits the underlying HMM structure. 

4.1. Model- Independent Strategy. We propose an algorithm for spurt detection based 
on the Exponential Weighted Moving-Average (EWMA) filter. The EWMA approach was 
first introduced by Roberts (1959) for (continuously) tracking and detecting a change in 
the mean of a sequence of observations. Here, the test-statistic (Rn) is a first-order autore- 
gressive version of the observation process (Zn) to be tracked with smoothing effected by 
an experimentally chosen parameter (A): 

Rn = {l- X)Rn-l + XZn, n > 1 

and -Ro = 0. The test-statistic is tested continuously against a threshold and change is 
declared at the first instant the test-statistic exceeds the threshold: 

(4.2) TEWMA = mf {n > 1 : ii„ > A^} . 

A^ is chosen to ensure that the average run length (ARL) to a false alarm is at least 7. Small 
values of A usually work best in changepoint detection as they smoothen small changes and 
enhance big changes (Srivastava and Wu, 1997). 

Since we are interested in tracking a change in the latent attributes, we focus on an 
attribute that reflects the resilience of the group, and another that reflects the level of 
coordination in the group (Santos, 2011; Lindberg, 2010). In particular, the ability of a group 
to sustain terrorist activity over a number of days reflects its capacity to rejuvenate itself 
from asset (manpower, material, and skill-set) losses and to modify strategies and plans to 
counter establishment actions. On the other hand, the ability of the group to launch multiple 
attacks over a given time-period reflects its capacity to coordinate these assets necessary for 
simultaneous action often over a wide geographic area. That is, mathematically speaking, 
the focus is on: i) X„, the number of days of terrorist activity, and ii) Yn, the total number 
of attacks, both within the (5-day time-window A„: 

X„ = ^ II {{A'U > 0}) 

for n = 1, 2, • • • and where Il(-) denotes the indicator function of the set under consideration. 
Note that Yn/5 is the average number of attacks per day and thus 5^ is a reflection of the 
intensity of attacks launched by the group. In general, X„ is more indicative of resilience 
in the group, whereas Yn captures the level of coordination better. 
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Specializing the EWMA approach of (4.2) to spurt detection in the activity profile of a 
terrorist group with Xn and Yn as observations, two parameters {Ai, A2} G [0, 1] are chosen 
appropriately and used to update the variables Ri^n and R2,n as follows: 

Ri,n = (1 — Ai)iii,„_i + XlXn 

R2,n = (1 — A2)-R2,n-1 + A2l^ 

for n > 1 with Ri^q = = -R2,o- The best choices of smoothing parameters Ai and A2 
for changepoint detection are obtained experimentally/numerically. We propose^ three 
stopping-times for declaring change: one based on Ri^m another based on i?2,ni and the 
third on a weighted combination (with weights a and \/l — a'^, a G [0, 1]) of the two 
test-statistics: 

Ti = inf {n > 1 : Ri^n > Ai} 
T2 = inf {n > 1 : i?2,n > ^2} 
^weighted = inf |n > 1 : + \/l - a'^R2,n > ^} , 

where the thresholds Ai,A2, and A are chosen to meet the corresponding ARL constraints. 
While design of the threshold (s) requires further study, experimental studies suggest that 
a threshold of the form 



{^1, A2, ^} = 0(log(7)) 

ensures that {ARL(ri), ARL(t2), ARL(Tvveighted)} = ^il)- Since Tweighted combines the in- 
formation contained in both {^n} and {^}, it should empirically be a better test than 
both Ti and T2. Nevertheless, all the three tests could be of potential utility depending on 
the nature of the terrorist group. 

4.2. Model-Based Strategy. We are now interested in model-based approaches for spurt 
detection. For this, note that the rare nature of terrorism suggests that most terrorist groups 
tend to be in an Inactive state for far longer than in an Active state. Thus, it is reasonable'^ 
to assume that Si = for long stretches of time and 70 ^ where 

(4.3) Ap(^ >0|5, =j), j = 0,l. 

Over such a long stretch where Si = 0, an elementary consequence of (4.3) is that Xn is a 
binomial random variable with parameters 6 and 70: 

P(^n = k)= Q ■ i^o)' • (1 - 70)'-' . 

^Note that other combinations such as the max statistic, min statistic, etc., can also be considered. We 
will not get into these details here. 

^Note that in (4.3), we have not made any specific assumptions on the distribution of Mi. In fact, we 
have only labeled the quantity in (4.3) as 7j. 
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If 6 is sufficiently large (typical values used in subsequent case-studies are S = 15 to 30 
days) so that the binning/edge effects can be neglected, Xn can be well-approximated 
by a Poisson random variable with rate parameter 5^o. In fact, we have the following 
bound (Teerapabolarn, 2012, Cor. 3.2) on the approximability of the binomial distribution 
by Poisson for A; = 0, • • • ,6: 

(4.4) I PBin (Xn = k)- Ppoi [Xn = k)\ < uAn {l - e"''^" , ^ j . 70 ~ 0. 

Equivalently, let T^, k = 1, 2, • • • denote the time to the feth day of terrorist activity (with Tq 
set to Tq = 0). Then, AT^ = Tk — Tk-i denotes the number of days of wait to the subsequent 
day of terrorist activity (inter-arrival duration) and AT^. is approximately exponential with 
mean I/70. While a similar reasoning suggests that AT^ in the Active state is exponential 
with mean I/71 , this fit is bound to be good only in the first-order sense because a terrorist 
group is expected to stay in the Active state for relatively shorter durations and 71 S> 70- 
Rephrasing the above conclusions, a discrete-time Poisson process model is a good model 
for the days of terrorist activity, especially in the Inactive state. 

Under the same assumptions (as above), in the Inactive state, Yn can be rewritten as 

Yn= Mi 

where An C A„ is the set of days of activity in A„ with \An\ = Xn- Thus, Yn can be approx- 
imated as a compound Poisson random variable (Cox and Isham, 1980) whose density can 
be expressed in terms of the density function of Mj. For example, if Mi is independent and 
identically distributed (i.i.d.) as geometric with P {Mi = k\Si = 0) = {1 — 70) • (70)^^5 k > 0, 
a straightforward computation (see Supplementary Part A) shows that 

P{Yn = r) = (l-7o)'-(7o)''- ('^^^"^), r>0. 

Similarly, the joint density of (Xn, Yn) can be written as 

PiXn = k,Yn = r) = (l-7o)'(7o)^- Q • (^ij^), r> A;, 

where the condition r > k ensures that at least one attack occurs on a day of activity. With 
the more general hurdle-based geometric model, where 

P{Mi = k\S, = 0) = i ,^ ^\ ? 
^ ' ^ I 70(1 - /uo) • (/io)''-^ k>\, 

the joint density is given as 

P{Xn = k,Yn = r)= Q (^^ - . (1 - 70)^-^X70)' • (1 - /^o)'(/io)^-', r > k. 
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Replacement of 70 and fiQ with 71 and /.ii in the Active state works subject to the same 
issues/constraints as stated earlier. 

While more detailed changepoint detection strategies such as the Cumulative Sum (CUSUM) 
or the Shiryaev-Roberts (SR) test (Tartakovsky and Veeravalli, 2005) can be constructed 
by using the statistics of X„ and Yn, we now propose an alternate approach to estimate the 
hidden states with the observations being {Yn}, and the joint sequence {{Xn, Yn)} 

under different modeling assumptions on {Mi}. For this, we first apply the classical HMM 
formulation (Rabiner, 1989) where the Baum- Welch algorithm is used to learn the pa- 
rameters that determine the density function of the observations. For the Baum- Welch 
algorithm to converge to a local maximum (with respect to the log-likelihood function) in 
the parameter space, a sufficient condition is that the density function of the observation 
be log-concave (Rabiner, 1989, Sec. IVA, p. 267). In Supplementary Part A, under the i.i.d. 
geometric and hurdle-based geometric models for {Mi}, it is established that all the three 
density functions are log-concave and hence also robust to initial estimates of the param- 
eters. Further, an iterative update for the parameter estimates is also established under 
these two models. With the converged Baum- Welch parameter estimates as the initializa- 
tion, the Viterbi algorithm is then used to estimate the most probable state sequence given 
the observations. The output of the Viterbi algorithm is a state estimate for the period of 
interest 



A state estimate of 1 indicates that the group is Active over the period of interest whereas 
an estimate of indicates that the group is Inactive. Transition between states indicates 
spurt/downfall in the activity. 

5. Case-Studies. We now undertake two case-studies on the fit of a discrete-time 
Poisson point process model for the days of terrorist activity. For this, we classify terrorist 
activities as reported in the RDWTI and GTD that catalogue activities by different groups 
across the world (RDWTI; LaFree and Dugan, 2007). 

5.1. FARC. We start with FARC, one of the most-feared and adaptive terrorist groups 
in the Americas. FARC is a Marxist-Leninist (pro-leftist) terrorist group active over a large 
geographical spread in Colombia and its neighborhood. FARC has a clearly laid out ideology 
that motivates and enables anti-government (and in general, anti-establishment) activities. 
We study the activities of FARC over the nine-year time-period from 1998 to 2006. This 
period covers a total of 604 terrorist incidents in the RDWTI with a yearly break-down 
of 44 incidents in 1998, 18 in 1999, 45 in 2000, 27 in 2001, 217 in 2002, 57 in 2003, 33 in 
2004, 66 in 2005, and 97 in 2006, respectively. The reasons why FARC and the 1998 to 2006 
time-period have been chosen for our study are provided in Supplementary Part B. 

As explained in Section 4, the activity profile of FARC can be modeled as a discrete-time 
sampled Poisson point process. In Fig. 3, we test the fit of this model by studying: i) the 
Quantile-Quantile (Q-Q) plots comparing the sample inter-arrival duration between consec- 




'Sn G {0, 1} for all i G A„ and n = 1, • • • , 



K 
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Q-Q plot of inter-arrival duration (Period 1 : 1 998-2001 ) 



Q-Q plot of inter-arrival duration (Period 2: 2002) 



Parameter 7 0.0857 





y Parameter 7 r; 0.4185 




+ 



(a) 



Q-Q plot of number of days with attacks [Period 1 : 1 998-2001 ) 



Inter-arrival duration 

(b) 

Q-Q plot of number of days witti attacks (Period 2: 2002) 



Parameter 7^ k, 5.14 



Parameter 7(5 ^ 25.11 



Number of days with attacks 

(c) 



16 18 20 22 24 26 28 30 

Number of days with attacks 



(d) 



Fig 3. Q-Q plots of duration between days of terrorist activity with respect to a theoretical exponential 
random variable for the period from: (a) 1998 to 2001 corresponding to normal terrorist activity, and (b) 
2002 corresponding to a spurt in terrorist activity. Q-Q plots of number of days of terrorist activity over a 
(5 = 60 day time-window with respect to a theoretical Poisson random variable for the same two scenarios 
are presented in (c) and (d). 

utive days of terrorist activity and an exponential random variable with an appropriately- 
defined rate parameter (7), and ii) the Q-Q plots comparing the nmnber of days of terrorist 
activity over a time-window of 5 days and a Poisson random variable with parameter (^7. 
For both sets of Q-Q plots, we consider two scenarios: the first (the 1998 to 2001 period) 
corresponding to a period of "normal" terrorist activity, and the second (the 2002 period) 
corresponding to a spurt in terrorist activity. 

Figs. 3(a) and (b) compare the Q-Q plots of the sample inter-arrival duration under 
these two scenarios with an exponential random variable. The rate parameter used for the 
exponential is 



(5.1) 



1 



7 
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Time-window: 6=15 days 



20 



Time-window: 5=15 days 




FARC data 



,R ,1 =0.05 



50 



100 



150 



200 



Index of time-window 



Index of time-window 



(a) 



(b) 



Fig 4. (a) Tracking properties of R2,n in response to changes in Y„ as a function of the smoothing parameter, 
(b) Performance of the three EWMA tests in spurt detection for FARC data (5 = 15 days is used). 

where E[ATfc] is the sample mean of the inter-arrival duration over the considered period. 
Prom Fig. 3, we note that both in periods of normal as well as a spurt in activity, the 
Q-Q plot is nearly linear with a few sample outliers in the tails. These outliers indicate 
that an exponential model for inter-arrival duration is not accurate because of the heavy 
tails. Nevertheless, to a first-order, an exponential random variable serves as a good fit for 
the inter-arrival durations. Our numerical study leads to the following estimates for 7,: 
(a) 7 0.0857, (b) 7 0.4185, suggesting that a spurt in activity is associated with an 
increase in the rate parameter. Similarly, in Figs. 3(c) and (d), we compare the Q-Q plots 
of the number of days of terrorist activity over a (5 = 60 day time-window under the same 
scenarios (as above) with a theoretical Poisson random variable of parameter 5^ = 60 7. 
While we observe some outliers in the tail quantiles, a Poisson random variable seems to 
be a good first-order fit for the number of days of terrorist activity. 

We now study the performance of the proposed EWMA approach with the FARC data. 
In the first study, presented in Fig. 4(a), we study how the tracking properties of R2,n 
(to changes in y„) behave with different values of the smoothing parameter A2. The true 
behavior of is also provided as a benchmark to compare the behaviors of i?2,n- In 
general, small values of A2 smoothen big changes in Yn considerably, whereas large values 
of A2 ensure that the evolution of R2,n is vulnerable to even small changes in 1^. As can be 
seen from the plot, a value of A2 = 0.10 trades-off the phenomena of smoothing out of small 
changes and enhancing of big changes, and thus renders the test-statistic R2,n effective from 
a changepoint detection perspective. 

In Fig. 4(b), we study the performance of the three EWMA tests proposed in this work. 
We plot the test-statistics: Ri^n with Ai = 0.05 for ri, i?2,n with A2 = 0.10 for r2, and 
tt-Ri,n + \/l — o?R2,n with a = 0.25 for Tweighted- The threshold is designed as {Ai, A2, A} = 
31og(7) for 7 = 10. From Fig. 4(b), we see that an appropriate weighted combination of 
the metric that captures resilience and the level of coordination in the group performs 
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better than either test-statistic taken separately (with the same threshold for all the three 
tests). While with the FARC data, the weighted sum performs only marginally better 
than the resilience-based metric, in general, we expect Tweighted to significantly improve 
the performance over either ri or T2- But more importantly, the EWMA approach detects 
only persistent changes or changes that last for a sufficiently long duration so that the 
changepoint detection approach can work accurately. In other words, the major spurts in 
FARC activity are detected, while the approach cannot detect the minor spurts. This is 
because the approach does not incorporate the statistical information of or {1^}. 

Table 1 

State classification of FARC with the geometric and hurdle-based geometric models and {{Xn, Yn)} as 
observations for different time-window periods (S ) 



S (in 


Parameters 


Length of 


No. of Active 


Fractional activity 


days) 


learned 


training-set {N 


states (Nspurt 


y- ) 




70 


71 


time- windows) 


time- windows) 




1 


0.0924 


0.3597 


3286 


483 


0.1469 


5 


0.0919 


0.3584 


657 


140 


0.2130 


7 


0.0924 


0.3598 


469 


89 


0.1895 


10 


0.0911 


0.3568 


328 


73 


0.2221 


15 


0.0924 


0.3605 


219 


46 


0.2099 


25 


0.0908 


0.3592 


131 


26 


0.1977 


30 


0.0930 


0.3593 


109 


20 


0.1825 



5 (in 

days) 



Parameters 
learned 



70 



fJ-o 



71 



Length of 
training-set (A*' 

time- windows) 



No. of Active 
states (Nspurt 
time- windows) 



Fractional 
activity 

t - —AT 



1 
5 

7 
10 
15 
25 
30 



0.0950 
0.0942 
0.0949 
0.0936 
0.0958 
0.0934 
0.0954 



0.0752 
0.0730 
0.0742 
0.0724 
0.0759 
0.0738 
0.0794 



0.3982 
0.3934 
0.3956 
0.3891 
0.4019 
0.3957 
0.3966 



0.3083 
0.3066 
0.3081 
0.3044 
0.3103 
0.3046 
0.3073 



3286 
657 
469 
328 
219 
131 
109 



484 
140 
87 
71 
42 
26 
20 



0.1472 
0.2130 
0.1853 
0.2160 
0.1917 
0.1977 
0.1825 



To overcome this deficiency, as explained in Section 4.2, a geometric model is assumed 
for {Mi} and the Baum-Welch algorithm is used to learn the underlying 7j with {^n}, 
{1^}, and {{Xn, Yn)} over a given 5-day time-window as training data. Specifically, Ta- 
ble 1 summarizes the parameter estimates for different 5 values when {{Xn, Yn)} is used 
as a training-set. It is to be noted that the learned parameter values remain reasonably 
stable across a large range of 5 (1 to 30 days) values. The performance of the Baum-Welch 
algorithm is also robust to both the length of the training-set as well as the initialization. 
Further, the parameter values also remain essentially independent of whether {1^}, 
or {{Xn, Yn)} is used for training. For example, with 5 = 15 and {{Xn, Yn)} as training 
data, the parameter values learned are 70 = 0.0924 and 71 = 0.3605, whereas with {Xn}, 
these values are 70 = 0.0941 and 71 = 0.3861. This observation is not entirely surprising 
since {Mi} is assumed to come from a one parameter model family and conditioned on 
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one of the two variables (X„ or 1^), the other variable adds no significant new information 
about the model parameter. 



Time-window: 5 = 15 days 



Time-window: 5=15 days 





index of time-window 

(a) 

Time-window: S = 15 days 



Index of time-window 

(b) 

Time- window: 5=15 days 





Index of time-window 



Index of time-window 



(d) 



Fig 5. State classification via Viterbi algorithm with the geometric model for {Mi}; observations being a) 
{Xn}, b) {Yn\, and c) {(X„, Yn)} with parameters learned by Baum-Welch algorithm, d) State classification 
with the hurdle-based geometric model for the observation sequence {(X„, Yn)}. 

The converged Baum-Welch parameter estimates are then used to initialize the Viterbi 
algorithm to obtain the most probable state sequence for FARC. Figs. 5(a)-(c) illustrate 
the state classification for 5 = 15 with {1^}, and {{Xn, Yn)} as observations. As 

can be seen from Fig. 5, in a clear departure from changepoint detection with the EWMA 
approach, the state classification approach detects even small and non-persistent changes. 
Further, the Viterbi algorithm declares 51 and 46 of the 219 time-windows as Active with 
{Xn} and {Yn}., respectively, whereas the joint sequence {[Xn-, Yn)} results in the same 
classification as {5^,}- Table 1 summarizes the number of time- windows classified as Active 
for different 5 values with {{Xn, Yn)} as observations. This study also suggests that 5 = 10 
to 15 with the HMM approach optimally trades-off the twin objectives of detecting minor 
spurts in the activity profile of FARC (larger Ngpurt) as well as minimizing the number of 
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Active state classifications that require further attention (smaller fractional activity f). 

We now consider a more general two-parameter hurdle-based geometric model for {Mj} 
that potentially allows the joint sequence {(X„, Yn)} to result in better inferencing on the 
states than either or {1^}. As before, the Baum- Welch algorithm is used to learn the 

underlying parameters with different values of 6 and {{Xn, Yn)} as training data. Table 1 
summarizes these parameter estimates and as was the case earlier, it can be seen that 
the estimates remain reasonably stable across 6. The parameter estimates are then used 
with the Viterbi algorithm to infer the most probable state sequence (see Fig. 5(d) for state 
classification in the 6 = 15 setting). While Fig. 5(d) and Table 1 show that the classification 
with the hurdle-based model agrees with the simpler geometric model for many 6 values, 
in the setting of interest {5 = 7 to 15 days), the hurdle-based model is more conservative in 
declaring an Active state. The four time-windows of disagreement between the hurdle-based 
geometric and geometric models for 5 = 15 correspond to the boundary of minor spurts 
(time- windows 11, 64, 191, and 204) where the hurdle-based model is more conservative in 
declaring an Active state, whereas the geometric model is trigger-happy and overcautious. 

To provide further intuition on the classification behind the Viterbi algorithm, we now 
propose an ad hoc approach to classify states, the outcome of which mirrors the outcome 
of the Viterbi algorithm. In this approach, state classification is done in two steps. In the 
first step, {AT^} are used as observations with the Viterbi algorithm to result in a state 
estimate for the days of activity 



In the second step, the following re-estimation formula provides an estimate of the states 
over the (5-day time-window: 



Note that the above formula is well-defined as it needs Sj only for the days of activity since 
Mi = otherwise. Further, r/ is a threshold that is chosen appropriately (typical values 
used in our study are rj = 0.5,0.75, etc). 

Since the Viterbi algorithm in the first step uses only {AT^} as observations, the fre- 
quency of attacks perpetrated by the group (X„ or the group's resilience) is the most 
important factor that determines the initial state assignment {sk}- In fact, the number 
of attacks on a day {Yn or the group's level of coordination) has no impact on {s^}- To 
compensate for not using this information, the re-assignment procedure in (5.2) is biased to 
the Active state if more attacks happen in a time-window. For example, if Mj = m > 1 for 
all i G A„, Sn is declared 1 only if an rj fraction of the days of activity in A„ are declared 
as Active in the first step. Similarly, if there exists a single day of heightened activity in A„ 
such that Mi » Mj for some i and all j / i, Sn is declared 1 provided ?i = 1 in the first step. 
In other words, the state declaration combines both the facets that are responsible 
for a spurt in the activity profile: an increased frequency of attacks, or more attacks on a 



Sk G {0, 1} for all k : Mk > 



(5.2) 
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day. With 6 = 15 for the FARC dataset, the ad hoc approach declares as Active ah the 42 
time-windows classified by the Viterbi algorithm as Active. In addition, it also classifies 32 
other time-windows (that are declared by the Viterbi algorithm as Inactive) also as Active. 
Thus, while the proposed approach is trigger-happy relative to the Viterbi algorithm, it 
nevertheless provides a certain intuition on how the resilience and coordination aspects of 
the group could be combined in understanding the group dynamics. While this combination 
of the two facets of the group is ad hoc, the proposed approach opens up the exploration 
of a more systematic combination of these facets. This will be the subject of future work. 



Time-window; S = 25 days Time-window: S = 25 days 




50 100 150 200 50 100 150 200 

Index of time-window index of time-window 



(a) (b) 

Fig 6. (a) Number of days of terrorist activity, and (b) Total number of attacks in aS = 25 day time-window 
as a function of time for Shining Path. 

5.2. Shining Path. The second case-study is the activity profile of Sendero Luminoso 
(more popularly known by its English name of Shining Path), one of the major terrorist 
groups in Peru for the sixteen year time-period from 1981 to 1996. Shining Path was guided 
by the Maoist (pro-leftist) ideology with a strong inclination to attack targets that had been 
perceived as "interventionist" or "establishmentist." Thus, it had been primarily involved 
in guerilla warfare on such diverse sets of targets as non-Peruvians as well as coordinated 
attacks on: i) government installations, ii) embassies and foreign missions in Peru, iii) cul- 
tural centers of, and aid organizations funded by foreign countries, iv) multi-national as well 
as foreign-owned companies, etc. Further, Shining Path was involved in the drug trafficking 
and trans-shipment business as a means to fund its activities (World Drug Report, 2010). 
With a focus on the sixteen-year period between 1981 and 1986, the RDWTI reports^ a 
total of 163 incidents with a yearly breakdown of 10 incidents in 1981, 7 in 1982, 10 in 1983, 

^After much of this work had been completed, the authors learned that the GTD could provide a better 
dataset for Shining Path than RDWTI. In fact, the GTD reports at least 4000 attacks over the same time- 
period for Shining Path. Processing of this data is manually intensive and a future work on comparative 
lessons on different types of terrorist groups is planned to address this data. Thus, much of the ensuing 
study should be seen in the light of what is possible with the proposed algorithms rather than as a reflection 
of the quality of the dataset used in the study. 
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Q-Q plot of inter-arrival duration [Period 1 : 1 981 -1 9 



Q-Q piot of inter-arrival duration (Period 2: 1987-1990) 




Parameter 7 ^ 0.0211 



20 40 so so 100 190 140 160 180 200 

Inter-arrival duration 



" + Parameter 7 0.0344 



20 40 60 80 100 120 140 

Inter-arrival duration 



(a) 



(b) 



Q-Q plot of inter-arrival duration (Period 3: 1 991 ) 



Q-Q plot of inter-arrival duration (Period 4: 1992-1996) 



Parameter 7 ^ 0.0755 



Parameter 7 ^ 0.0158 
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Fig 7. Q-Q plots of duration between days of terrorist activity with respect to a theoretical exponential 
distribution for the period from: a) 1981 to 1986, b) 1987 to 1990, c) 1991, and d) 1992 to 1996 corresponding 
to different stages of evolution of Shining Path. 

7 in 1984, 3 in 1985, 12 in 1986, 19 in 1987, 7 in 1988, 18 in 1989, 10 in 1990, 31 in 1991, 
10 in 1992, 14 in 1993, 2 in 1994, 2 in 1995, and 1 in 1996. The choice of Shining Path and 
the 1981 to 1996 time-period are motivated in Supplementary Part B. 

In Fig. 6, we plot the number of days of terrorist activity and the total number of attacks 
over a (5 = 25 day time-window. In Fig. 7, we test the fit of inter-arrival duration between 
successive days of terrorism incidents with respect to an exponential random variable with 
parameter 7 estimated as in (5.1). As noted in Supplementary Part B, the evolution of 
Shining Path can be partitioned into four distinct phases. The inter-arrival duration in 
each phase can be distinctly modeled as an exponential random variable and Fig. 7 shows 
that this partitioning is reasonable. As in the case with FARC, the exponential random 
variable is a good first approximation as the tails are not well-modeled with this random 
variable. Our numerical study leads to the following estimates for 7, in the four phases: a) 
7 ^ 0.0211, b) 7 f« 0.0344, c) 7 0.0755, d) 7 0.0158. 
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Fig 8. (a) Tracking properties ofRi,n in response to changes in Xn as a function of the smoothing parameter, 
(b) Performance of the three EWMA tests in spurt detection for Shining Path data (5 = 25 days). 

As elucidated with the FARC dataset, the data corresponding to Shining Path is studied 
in the following experiments. In Fig. 8(a), the tracking properties of Ri^n for different values 
of Ai are presented along with the number of days of terrorist activity over a (5 = 25 day 
time-window. From this figure, we see that a small value of Ai such as Ai = 0.05 or 0.10 is 
best suited in changepoint detection. In Fig. 8(b), the test-statistic corresponding to ri, r2, 
and Tweighted are presented along with a threshold that is chosen appropriately. Using the 
HMM approach with the hurdle-based geometric model described in Section 4.2, the Baum- 
Welch algorithm results in parameter estimates as in Table 2 for different values of 5. State 
classification via the Viterbi algorithm using these estimates results in an Active/ Inactive 
classification for Shining Path, e.g.. Fig. 9 displays a typical classification for 5 = 25. It is 
important to note that while four distinct phases are identified in the evolution of Shining 
Path, only a.d = 2-state HMM is studied here. For the purpose of spurt/downfall detection, 
even this coarse and parsimonious model is sufficient. As can be seen with FARC data, even 
small and non-persistent changes are detected by the HMM approach, further confirming 
its usefulness. 

Table 2 

State classification of Shining Path with the hurdle-based geometric model and {{Xn, Y„)} as observations 

for different time-window periods (S) 



5 (in 

days) 



Parameters 
learned 



7o 



71 



Ml 



Length of 
training-set (A*' 

time- windows) 



No. of Active 
states (Nspurt 
time- windows) 



Fractional 
activity 

r Nspurt- 

^ - —AT 



20 0.0262 0.1026 0.1000 0.6667 268 

25 0.0264 0.1019 0.0800 0.6667 215 

28 0.0264 0.1019 0.0714 0.6667 192 

30 0.0262 0.1026 0.0667 0.6667 179 

32 0.0202 0.0703 0.1072 0.2255 168 



69 
35 
15 
14 
38 



0.2575 
0.1628 
0.0781 
0.0782 
0.2262 
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Time-window: 5 = 25 days 
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Fig 9. State classification with Shining Path data (5 = 25 days). 



6. Revisiting Some of the Modeling Issues. We now re-explore some of the mod- 
eling issues that shed further light on development of good models for the activity profile 
of terrorist groups. In this section, the primary focus is on the FARC dataset. 

6.1. Clustering of Attacks. A central premise in many work on terrorism modeling is 
that the current activity of a terrorist group is influenced by its past activity. One conse- 
quence of this premise is that the attacks perpetrated by the group are clustered (Midlarsky, 
1978; Midlarsky, Crenshaw and Yoshida, 1980). Ripley's K{-) function is a statistical tool 
for measuring the degree of clustering (aggregatedness) or inhibition (regularity) in a point 
process as a function of inter-point distance (Diggle, 2003). Specifically, if A is the intensity 
of the point process, \K{h) is the expected number of other points within a distance h of 
a randomly chosen point of the process: 

K{h) = — • E [Number of other points within distance /i of a randomly chosen point] . 
A 

Expressions for K{h) can be derived for a number of stationary point process models (Dixon, 
2002). For example, in the case of a one-dimensional/temporal point process that is com- 
pletely random (where points are distributed uniformly and independently in time), it can 
be shown that K{h) = 2h. A two-dimensional complete random spatial point process leads 

to K{h) = TTh^. 

In the context of an activity profile, K(h) can be estimated as 

(6.1) k{h) = J—j2Y.i(\t,-t,\<h) 

where ti is the ith day of activity, is the number of attacks in the observation period of A/" 
days (that is, [ti, • • • , tAr] = [Day 1, • • • , Day AA]), and A = ^ is an estimate of the intensity 
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(rate) of the process. Significant deviations of K(h) from 2h indicate that the hypothesis 
of complete randomness becomes untenable with observed data and more confidence is 
reposed on clustering (if K{h) ^ 2h) or inhibition (if K{h) <^ 2h). 

While the definition in (6.1) assumes a homogenous point process, extensions to an in- 
homogenous point process have also been proposed (Baddeley, Moller and Waagepetersen, 
2000; Veen and Schoenberg, 2006), where the point process is re- weighted by the reciprocal 
of the non-constant intensity function to offset the inhomogeneity. Further, to compensate 
for edge effects due to points outside the observation period being left out in the numer- 
ator of (6.1), various edge-correction estimators have also been proposed in the literature. 
Combining these two facets, we have the following estimator^ for K{h): 

where pi is the estimated probability of at least one attack on tj. In this work, we use 
an edge-correction factor Wij due to Ripley (Cressie, 1991, pp. 616-618) which reflects the 
proportion of the period centered at U and covering the jth day of activity that is included 
in the observation period: 

1, li ti + R<ti <tN - R 

\i ti> xn&yi{tM - R, ti+ R) 
^'~^2R^^ , a ti <m.m{tN - R, h + R) 
if tN - R<ti<ti + R 

where R = \ti — tj\. 

In Fig. 10(a), we plot K{h) — 2h computed as in (6.2) for the FARC dataset (with and 
without edge-correction) as a function of the inter-point distance h. In line with the obser- 
vation in Porter and White for the Indonesia/Timor-Leste dataset, the plot here indicates 
that the FARC data is also clustered since K(h) — 2/i 0, thus motivating the self-exciting 
hurdle model of Porter and White. The main theme of this work, however, is that clustering 
is essentially a reflection of state transitions and the activity sub-profile (sub-series) of the 
group conditioned on a given state value is not clustered. To test this hypothesis, we study 
the behavior of K{h) for the sub-series from the FARC dataset corresponding to the Active 
and Inactive states as classified by the methodology of Section 5.1. 

Since the latent states of the group transition back and forth, an Active sub-series is 
constructed by patching together the group's activity profile in the Active state with the 
jump random variable between any two disjoint pieces of the activity profile modeled as 
Poisson with parameter A = For this sub-series, K{h) is estimated using a formula 
analogous to (6.2), where pi and Wij are re-estimated for the sub-series, and M is replaced 
with J\f act- 

A/'act = niin ( Number of days in Active state. Length of Active sub — series 



^Note that Porter and White propose a one-sided estimator for K{h) and tliey compare K{h) with h 
(instead of 2h) to test for clustering/inhibition. 
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A similar estimation process yields K{h) for the Inactive sub-series. In contrast to the 
trends for the activity profile, the sub-series' of the activity profile indicate an inhibitory 
behavior (mild inhibition for the Active state and strong inhibition for the Inactive state). 
These conclusions should not be entirely surprising since very few attacks happen in the 
Inactive state. Thus, Fig. 10(a) prompts that the classical assumption of clustering of 
attacks needs to be revisited and the causality assumptions behind terrorism models studied 
more carefully across many groups/datasets. 

6.2. Model for Observation Densities. We now develop simple models for the observa- 
tion density under the two states. For this, we study the goodness-of-fit captured by the 
AIC for several models with support on the non-negative integers to describe data from 
FARC. In Table 3, we present the histogram of the number of days with £ (i = 0,1, ■ ■ ■ ) 
attacks per day for FARC data. Applying the state classification algorithm described in 
Section 4.2 with 6 = 15, FARC stays in the Inactive state for 2657 days and in the Active 
state for 630 days. Also, presented in the same table are the (rounded-off ) expected number 
of days with I attacks for the six models described in Supplementary Part A, along with 
the AIC for these model-fits. Table 4 presents the corresponding histograms for 6 = 10, 
with which FARC stays in the Inactive state for 2577 days and in the Active state for 710 
days. 

The ML parameter estimates for all the six models remain reasonably robust as 6 is 
varied, which is in conformity with the stability of the converged Baum- Welch parameter 
estimates with 6 (see Tables 1 and 2). Further, from Tables 3 and 4, in the Inactive state, 
it is seen that all the models except the shifted Zipf result in comparable fits. Specifically, 
the hurdle-based geometric model differs from the observed histogram in only one day 
and results in the second lowest AIC value. On the other hand, in the Active state, the 
hurdle-based geometric model produces the best fit with only the Polya model resulting 
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Table 3 

Histogram of observed number of attacks per day for FARC data with different model-fits, S = 15 days 



No. attacks 


Obs. 


Poisson 


Shifted 


Geomet. 


Polya 


Hurdle- 


Hurdle- 


(Inactive 






Zipf 






Based 


Based 


State) 












Zipf 


Geomet. 





2420 


2421 


2470 


2430 


2421 


2420 


2421 


1 


227 


225 


144 


207 


225 


229 


226 


2 


9 


11 


27 


18 


11 


7 


10 


3 


1 





8 


2 





1 





4 








4 














>4 








4 














AIC 




1690.34 


1772.81 


1696.74 


1692.32 


1692.58 


1691.86 


Parameter 




0.0933 


4.105 


0.0854 


fo = 24.4749, 


70 = 0.0892, 


^0 = 0.0444, 


Estimate 










yo = 0.0038 


yo = 5.10 


70 = 0.0892 


No. attacks 


Obs. 


Poisson 


Shifted 


Geomet. 


Polya 


Hurdle- 


Hurdle- 


(Active 






Zipf 






Based 


Based 


State) 












Zipf 


Geomet. 





384 


359 


455 


404 


389 


384 


384 


1 


174 


202 


87 


144 


160 


189 


171 


2 


46 


57 


33 


52 


56 


31 


52 


3 


19 


11 


16 


19 


17 


11 


16 


4 


4 


1 


9 


7 


6 


5 


5 


> 4 


3 





30 


4 


2 


10 


2 


AIC 




1313.88 


1416.88 


1291.73 


1288.85 


1308.09 


1287.11 


Parameter 




0.5651 


2.40 


0.3611 


ri = 1.4834, 


71 = 0.3905, 


Ail = 0.3090, 


Estimate 










yi = 0.2759 


yi = 2.61 


71 = 0.3905 



in a comparable fit. In this setting, the one-parameter models over-estimate either the tail 
or the days of no activity, while the hurdle-based Zipf produces a heavier tail than what 
the data exhibits. In fact, the poorest fit in either state is obtained with the shifted Zipf 
suggesting that a heavy tail may not always be necessary. In contrast, the Indonesia/Timor- 
Leste data studied in (Porter and White, 2012) exhibits several extreme values (e.g., days 
with 36, 11, and 10 attacks) and the authors observe that a self-exciting hurdle-based Zipf 
model captures the heavy tails much better than other models. The FARC dataset used 
here shows a maximum of 7 attacks per day, whereas the Shining Path^ dataset shows a 
maximum of 3 attacks per day. Even simple non-self-exciting models are enough to capture 
these datasets sufficiently accurately. 
This study suggests the following: 

• If parsimony of the model is of critical importance, the geometric distribution serves 
as the best one-parameter model with the Poisson/shifted Zipf models either under- 
estimating or over-estimating the number of days with no activity in the Active state. 

• If parsimony is not a critical issue and the data does not have (or has very few) 
extreme values, the hurdle-based geometric model serves as the best/near-best model 
in either state. 

^Note that the data from GTD on Shining Path, which is more richer, is not expected to conform to the 
trend from RDWTI. 
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Table 4 

Histogram of observed number of attacks per day for FARC data with different model-fits, 5 = 10 days 



No. attacks 


Obs. 


Poisson 


Shifted 


Geomet. 


Polya 


Hurdle- 


Hurdle- 


{Inactive 






Zipf 






Based 


Based 


State) 












Zipf 


Geomet. 





2379 


2376 


2420 


2385 


2380 


2379 


2379 


1 


188 


193 


124 


178 


188 


191 


188 


2 


10 


8 


22 


13 


9 


6 


9 


3 








6 


1 





1 


1 


4 








2 














> 4 








3 














AIC 




1478.87 


1537.97 


1481.36 


1480.56 


1484.02 


1480.77 


Parameter 




0.0807 


4.29 


0.0747 


fo = 5.0975, 


70 = 0.0768, 


Mo = 0.0481, 


Estimate 










yo = 0.0156 


yo = 4.915 


70 = 0.0768 


No. attacks 


Obs. 


Poisson 


Shifted 


Geomet. 


Polya 


Hurdle- 


Hurdle- 


[Active 






Zipf 






Based 


Based 


State) 












Zipf 


Geomet. 





425 


407 


513 


456 


437 


425 


425 


1 


213 


226 


97 


163 


187 


226 


206 


2 


45 


63 


37 


58 


61 


34 


57 


3 


20 


12 


18 


21 


18 


11 


16 


4 


4 


2 


11 


8 


5 


5 


4 


> 4 


3 





34 


4 


2 


9 


2 


AIC 




1452.09 


1599.89 


1444.86 


1435.32 


1443.56 


1430.33 


Parameter 




0.5577 


2.3950 


0.3580 


fi = 1.8427, 


71 = 0.4014, 


/2i = 0.2803, 


Estimate 










yi = 0.2323 


yi = 2.735 


71 = 0.4014 



• However, if the data has several extreme values (as seen in Porter and White), the 
self-exciting hurdle model offers the best model-fit albeit at the expense of learning 
several model parameters. 

6.3. Inter- Arrival Duration. We now study the efhcacy of the HMM framework by test- 
ing the goodness-of-fit of the theoretical exponential random variables with respect to the 
inter-arrival duration between days of terrorist activity in either state. To avoid estimating 
the rate parameters of the exponential random variables from the data (which complicates 
the hypothesis tests), we use the fact from (Seshadri, Csorgo and Stephens, 1969) that if 
2/1) ■■ ■ ^Vm are i.i.d. exponential random variables (with a given rate parameter), then 

(6.3) = ^l^i^, j = I,--- ,m 

are i.i.d. uniformly distributed in [0, 1]. We then use a Kolmogorov-Smirnov (KS) test to 
study the fit between the empirical cumulative distribution function (CDF) of Zj and the 
uniform CDF (Durbin, 1973). 

The KS test-statistic (denoted as KS,) and the critical value for the test (denoted as 
Ka) corresponding to a significance level a and computed using the standard asymptotic 
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formula are given as 

I 1 "* 

KS, = max — > 1 (2* < x) — x 
X I n, 

1=1 

. ./ziMl) 
V 

where n, is the number of samples and {2;*} are the transformed samples computed us- 
ing (6.3) in either state. The results of applying the KS test to the two states are presented 
in Table 5. From this table, it is clear that the samples in either state fit the theoretical 
exponential assumption very accurately with the exponential model rejected in the Active 
(and Inactive) state(s) if the significance level a exceeds a > 0.3763 (and a > 0.7905), 
respectively. 



Table 5 

KS test-statistic in the two states for FARC dataset 





Active 


Inactive 


Total number of samples 


245 


236 


No. of samples with ATk < P 


n^ = 234, /3 = 15 


m = 192, ^ = 20 


KS statistic 


0.0597 


0.0492 


p- value 


0.3763 


0.7905 



On the other hand, by studying terrorism data for FARC over the time-period 1998 
to 2005 from the MIPT database (a database comparable with RDWTI, but currently 
unavailable), Clauset and Gleditsch hypothesize that the inter-arrival duration between 
successive days of attacks "decreases consistently, albeit stochastically" with the cumulative 
number of events FARC has carried out - a measure of the group's experience. Our initial 
studies indicate that while Clauset and Gleditsch's hypothesis holds empirically true for 
k < 25 attacks that encompass the period Jan. 1 to June 27, 1998, it consistently increases 
through the subsequent period lasting till Mar. 10, 2000. Note that this is the precise time- 
period of increased U.S. funding to combat FARC and the drug economy through Plan 
Colombia (Haugaard, Isacson and Olson, 2005) and a mean increase in the time to the 
next day of activity indicates an impact of counter-terrorism efforts. The following period 
through Aug. 13, 2004 indicates a reversed trend of consistent decrease suggesting that 
the organizational dynamics of FARC had "adjusted" to the new reality of combat with 
the establishment. As seen earlier, such distinct changes in the organizational dynamics 
(associated with spurts and downfalls) are quickly identified by the approaches proposed 
in this work. 

6.4. Robustness of Proposed Approach to Missing Data. We now study the robustness 
of the proposed spurt detection approach in terms of state classification to attacks that are 
not available in the database. Toward this goal, we treat the FARC data from RDWTI over 
the 1998 to 2006 period as the baseline dataset and add one missing day of activity per 
year from the GTD in a sequential manner and revisit state classification with the enhanced 

imsart-aoas ver. 2012/04/10 file: HMM_terrorism_arxiv2.tex date: July 30, 2012 



HIDDEN MARKOV MODELS 



27 



dataset. Specifically, the fraction of missing data added in the jth (sequential) step is the 
ratio of the difference between new and old attacks and the baseline, and is defined as 

V'^ - M- 
Frac. Missing Data(j) ^ ^'=\r ' - 

where M/ is the number of attacks on the ith day with data addition. Applying the state 
estimation algorithm proposed in Section 4.2 to this new dataset, let s^^J^ G {0, 1} denote 
the estimated state value on the ith day (i = I,-- - The fractional change in state 

classification is then defined as 



Frac. State Classification Changes (j) 



EM |-~new _ -J. I 
1=1 Pi, j 



with Sj denoting the state classification on the ith. day with the baseline dataset. 

In Fig. 10(b), Frac. Missing Data(j) is plotted as a function of Frac. State Classification 
Changes(j). In general, combining terrorism information from two different databases with 
a clear dichotomy in terms of data-collection standards (criteria for inclusion and non- 
inclusion of events, source material used, etc.) can introduce a systematic bias in terrorism 
trends (LaFree and Dugan, 2009). Despite this anomaly, it is clear from Fig. 10(b) that the 
proposed approach is remarkably robust to a small amount of missing and/or mislabeled 
data. For example, the addition of an ~ 5% more data to the baseline dataset leads to 
essentially no changes in state classification with the baseline dataset. On the other extreme, 
big additions of even up to 65% more data result in only an ~ 10% mismatch in state 
classifications. 

7. Concluding Remarks. This work developed a HMM framework to model the 
activity profile of terrorist groups. Key to this development was the hypothesis that the 
current activity of the group can be captured completely by certain states/attributes of the 
group (such as its Intentions and its Capabilities) , instead of the entire past history of the 
group. In the simplest example of the proposed framework, the group's activity is captured 
by a d = 2 state HMM with the states reflecting a low state of activity (Inactive) and a high 
state of activity (Active), respectively. In either state, the days of activity are modeled as a 
discrete-time Poisson point process with a hurdle-based geometric model being a good-fit 
for the number of attacks per day. While more general models can be considered, even the 
simplest framework is sufficient for detecting spurts and downfalls in the activity profile of 
many groups of interest. 

For this purpose, two detection methodologies were proposed in this work. The first of 
these approaches was a stopping-time based on the EWMA filter that tracks the quantities 
of interest. This approach does not require knowledge of the underlying model and is hence 
advantageous. We showed that persistent changes can be detected with this approach. The 
second approach that requires an estimate of the model parameters is to estimate the 
most probable state sequence using the Viterbi algorithm. We showed that this approach 
detects not only non-persistent changes, but also has a low detection delay thus allowing 
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the identification of key geopolitical undercurrents (or events) that lead to sudden spurts 
in a group's activity. This early detection of a spurt can then be used for further goals in 
course of action analysis such as course prediction and policy framing to handle, manage 
and/or ameliorate the activities of the terrorist group. 

Fruitful directions to explore in the future include development of more refined models 
for the activity profile (such as hierarchical HMMs, switching HMMs, etc.,) that incorpo- 
rate heavy tails and extreme outliers commonly observed in terrorism data. A systematic 
comparison between the class of self-exciting models and HMMs in explaining the cluster- 
ing of attacks and a possible bridge between these two classes will also be of interest. In 
terms of inferencing, non-linear filtering approaches such as particle filters that use causal 
information on terrorist activity and that are computationally feasible will be of importance 
in practice. Given the intensive nature of data collection and processing that is common 
for studies of this nature, it would be of interest in developing broad trends and trade- 
offs in quantitative terrorism studies with a large set of groups from different ideological 
proclivities. 

APPENDIX A: INFORMATION ON MODELS FOR THE NUMBER OF ATTACKS 

PER DAY STUDIED IN THIS WORK 

Table 6 provides information on the density function P(Mj = k\Si = j), k > 0, j = 

n 

0, 1, the log- likelihood function with observations Xi , the ML estimate of the model 

parameter (s), and the AIC for the six models discussed in Section 3 of the main text. The 
subsequent sections derive: i) the density function of Yn and (X„, Yn) under the geometric 
and hurdle-based geometric assumptions on {Mj}, and ii) Baum- Welch updates for model 
parameter (s) estimates in these two settings. 

A.l. Models for {Mi}. Table 6 considers six models for {Mj} and lists relevant 
information such as the density function, ML estimate of the parameter with observations 

n 

Xi , AIC, etc. The fits of these models to specific groups such as FARC and Shining Path 

i=l 

are studied in the main text of the paper. 

It can be seen that the geometric and the Poisson distributions can be obtained as special 
cases of the Polya distribution: 



P61ya(r, y) = Geometric (1 — y) 



r=l 



Polya ( r. 



A 



Poisson(A). 



Further, a straightforward calculation shows that the complementary cumulative distribu- 
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Table 6 

Information on observation density models for state j, j £ {0, 1}, (Support: k £ {0, 1, ■ ■ ■}) with 

I ^ 

observations Xi\ 



Name 



Density 



Log-likelihood function (log(I/)) ML estimate 



AIC 



Poisson 



i=l 



2-21og(L) 



Shifted 
Zipf 



(i+fc) "J 



-nlog (C iVj)) -Vj -T, log(l + 

1=1 



Vj solves = 
Elog(l + xO 

i = l 



2-21og(i) 



Geometric 



(1-7.) -(7.)' 



nlog(l - 7j) + log(7j) ■ Yl Xi 



7j = 



1=1 



1 = 1 



2-21og(L)l 



Polya 



r{fc+i)-r{rj) 
Vj) 



TjuXogil - yj) - nlog(r(r-j)) + 

n n 

logte)- E^-.- Eiog(r(i. + i)) 



2=1 J=l 



+ Elog (r(x,+r,)) 
1=1 



fj and j/j solve 
Eq. 1 and 2 

below 



4-21og(L) 



Hurdle- 
based 

Zipf 



1-7. , if fc = 



7j -fc 



C{!/j) 



if fc > 1 



no log(l - 7j) - (?i - ?io) log (C(j/j)) 
+ (n - no) log(7j) - y^ J2 log(x-i), 

i=l 

n 

where no = E = 0}) 

1=1 



Vj 



1 Tin 

7j = 1 - if. 
solves 

Y:7=r° i°g(^.) 

n — 7iQ 



4-21og(i) 



Vj, 



Hurdlc- 
bascd 

Geometric 



1-7. , if fc = 
7j(1 -Mj)x 



(«)'"', if fc>l 



no log (1 - 7j) + (n - no) log(73) 
+ ( E 3;, - n + no ) log(Mj) 



1j 



1 _ lai 

n ' 
1 _ n-ng 



^! = 1 / 

-f (n - no) log(l - flj) 



i=l 



4-21og(L) 



Eq. 1: y = E E + m 

1=1 ' \i = l 



Eq. 2: -nlog(l - y) = E " , 

i=l 

where ^'(■) and r'(-) denote the derivatives of (^{x) and r(2;), respectively, 
and 1(.) denotes the indicator function of the set under consideration. 



tion functions (CCDF) for these six models are respectively given by: 



P{X > X 
P{X > x) 
P{X > x) 



Poisson 



Geometric 

P(X > x) 

Polya 



P{X > X 
P{X > X 



Hurdle— based Geometric 



(A, 



x\ 



1 — e ^ as X — )■ oo 



X '^^^ 



Shifted Zipf {yj - 1) • C(yj) 

(7j)^ as x — )• oo 



as X — >• oo 



iy^r-ix + rj-l) 



r,-l 



as X — >• oo 



7i 



Hurdle-based Zipf CiUj) ' iUj ~ 1) 



• x ^'^J as X — )• OO 



— • {fJ-jY as X — > oo. 
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The above list shows that, of the six models studied here, only the shifted Zipf and the 
hurdle-based Zipf accommodate heavy tails, whereas the other^ four models have exponen- 
tially decaying tails. 

A. 2. Density function of Yn and Yn) for the geometric model. We now 

derive the probability density function of given that {Mi} is i.i.d. geometric with param- 
eter 7j in state j (j = 0, 1) as in Table 6. Since Yn = probability of observing 

r attacks in the nth time-window with Mj attacks on the ith day is given by 

5 

(A.l) n - ^.OCt,)''^ = (1 - 7,)' • (7,)^'-^" = (1 - 7,)' • (7,)^ , 

i=l 

which is dependent on {Mi} only through ^ Mj. Thus, 

i 

P(y. = r) = (1 - 7,.)' • (7,r • N 
(A.2) = (1 - 7.)^ •(7.r + 

where N = {^^^^~^^ is the multinomial coefficient that corresponds to the number of ways 
in which r attacks can be distributed over 6 days. 
Alternately, P{Yn = r) for r > 1 can be derived as 

min(r,(5) 

P{Yn = r) = P{yn = r,Xn = k) 

k=l 
min(r,(5) 

= Y P{Xn = k)-P{Yn = r\Xn = k) 
k=l 
min(r,<5) , 

(A.3) = Yl {M-l3f-\l^f-P{Yn = r\Xn=k), 

k=i ^ ^ 

where the upper limit for k in the first line comes from noting that Yn > Xn and Xn < 5 
and the lower limit follows since r > 1, and P(A„ = A;) in the second line follows a standard 
binomial density. Note that PiXn = AXn = A;) for A; > 1 is equivalent to computing the 
probability of arranging r — k balls in k boxes since each attack day has (by definition) at 

^Note that the Polya decays exponentiaUy as a; oo. 
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least one attack. This probability is given by 

?{Yn = r\Xn = k) = p[^Nh,=r\{Mi,>{),l=l,--- ,k}^ 

where the second line follows the same logic as in (A.l) and (A. 2). Combining (A. 3) 
and (A.4), we get the following: 

mm{r,5) , , , , 

mm(r,<5) 

It can be seen that (A.5) is consistent with the computation in (A. 2) by noting that ('^^^~^) 
is the coefficient of x*" in the binomial expansion of {l + xY~^^~^ , whereas X]^"^'"''^^ (f) • i^^-k) 
is the coefficient of x'' in the expansion of (1 + xf ■ (1 + xf-^. The constraints I <k < 
min(r, 6) follow since Q < r — k < r — 1 and < /c < 5 in the binomial expansions. Similarly, 
we can also write the joint density as 

P[Yn = r,Xn = k) = {l- 7,)^(7i)^ • Q • (' I ^) , ^ > ^- 

A. 3. Derivation of Baum- Welch estimate for 7^. The first step in the Baum- 
Welch algorithm is the verification of the log-concavity of the density function. For this, note 
that Xn which is Poisson distributed satisfies the log-concavity condition straightforwardly 
(Juang and Rabiner, 1990, Sec. Ill) and thus, the Viterbi algorithm can be directly applied 
to Xn with 7j. For 1^, note that 

log (P(y„ = r)) = ,51og(l - 7,) + rlog(7,) + log (^(^'^ + ^ ^) ) , 

which can be easily checked to be concave in 7^ for all r > 0. Since the joint density 
of {Xn, Yn) has the same form as the density of in terms of the parameter 7^, the 
log-concavity condition also applies to the joint density. 

We now develop an expression for the estimate of 7j (denoted as 7j) when Baum- Welch 
algorithm (Rabiner, 1989) is applied to a training-set of N observations O = {On, n = 
1, • • • , N}. Let the corresponding hidden states be 5 = {Sn, n = 1, • • • , N}. The Baum 
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auxiliary function with current/initial estimate of HMM parameters A (as a function of the 
optimization variable A), Q{X,\), is given as: 



Q{X, A) = J^log (P{0, S\X)] ■ P{0, S\X) 



We proceed via the same approach elucidated in (Bilmes, 1998) where the component of 
the auxiliary function due to the observation density is expressed as: 



Q(A,A) 



Elog(P(0,cS|A))-P(0,5|A) 

Obs. density _ S 



Obs. density 



P(0|A) P(0|A) 

EElog(P(0„|5„, A)).P(0,5|A) 



5 n=l 



P(0|A) 

E log(P(0„|5„, X))-PiO,Sn\X) 

n=l 



P{0\X) 

N 1 

(A.6) = ^^log(P(0^|Sn = j, A)) ■ P{Sn=j\0, A) , 

=7nO) 

where the iterative update for 7n(j) follows from (Rabiner, 1989, Sec. IIIA and B). Setting 
the observations O = {Xn} and using the Poisson density model for the observations 
in (A.6), a straightforward optimization shows that 

/ * 7\ _ Yln=l ^nlnii) 

Ln=l7n(j) 

Similarly, using the density model for O = {Yn} from Section A. 2 in (A.6), we have the 
following estimate: 

(A.8) 7. = ^n=iynln{j) 

Recall that the ML estimate of jj from Table 6 with {Mi, i S A^} as observations is 



7i 



Thus, the Baum- Welch estimate in (A.8) can be understood as a weighted version of the 
ML estimate with the weights being induced by 7n(i)- 

Following the same approach as above, it can be seen that with O = {{Xn-, ^)}, the 
same expression as in (A.8) holds. In other words, with the geometric model for {Mi}, Xn 
does not add new information if Yn is known a priori and thus 7j does not change. This 
conclusion should not be surprising as the geometric density is a one-parameter model. 
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A. 4. Extension to hurdle-based geometric model. As noted in equation (4.4) of 
the main text, A„ is Poisson with parameter 6jj in state j where 

7j^P(M,, >0|5, =j), j = 0,l. 

Following the approach in Section A. 2, with {X„, n = 1, • • • , N} as the training-set, ■jj is 
estimated by the same formula as (A. 7) in the geometric case. Note that while {A^n} can 
be used to estimate 7j, other parameters that describe the density function of Mj cannot 
be learned with {A„} alone. 

We now consider more general models for {Mj} that potentially allow {1^} or the joint 
sequence {(A„, Yn)} to result in better inferencing on the states than {A„}. In this general 
setting, the density function of depends on {Mj} not only through Mi, but also on the 
precise arrangement of attacks over the 5-day period. Thus estimating the other parameters 
using either {Yn} or {(A„, Yn)} becomes difficult, if not impossible. Nevertheless, under 
the special case of a hurdle-based geometric model (a simple two-parameter extension of 
the geometric model) elucidated in Table 6, it is possible to estimate jj and fij using the 
joint observation sequence {(A„, Yn)}. Following the same steps as in Section A. 2, the joint 
density can be written as 

P{Yn = r,Xn = k)= Q (^^ - . (1 - 7,)'-'=(7i)' • (1 - 

This density function is log-concave in the parameters and hence the Baum- Welch estimates 
can be obtained as follows: 

^, ^ En=l^n7n(j) 

^ '5-En=l7n(i) 
^. ^ En=l(>^n-A„)7„(i) ^ 

^ En=l^n7n(j) 

where the iterative update for 7n(j) is obtained from (Rabiner, 1989, Sec. IIIA and B) 
using the joint density of {(A„, Yn)}. 

APPENDIX B: BACKGROUND INFORMATION ON FARC AND SHINING PATH 

This supplementary section motivates the choice of the terrorist groups and the corre- 
sponding time-periods of interest that are the focus of this work. 

B. l. FARC. The choice of FARC has been motivated by the following reasons: 

• Among all terrorist groups active in Colombia over the 1998 to 2006 time-period, 
FARC is the most dominant with few groups rivalling FARC in terms of strength. 

• Being a group of the Marxist-Leninist proclivity, FARC leaves a tell-tale signature 
in terms of its attack profile. For example, in the 1998 to 2006 time-period that 
encompassed a total of 604 incidents from the RDWTI database, 512 incidents (or 
84.77% of all incidents) could be classified as either an incident on the establishment 
or infrastructure. 
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• Over the same time-frame, the competing group Ejercito de Liberacion Nacional 
(ELN) could not consoHdate its mihtary prowess, was geographically isolated and 
hence remained sub-dominant relative to FARC. ELN's categorical rejection of any 
participation in the drug economy (which is posited to have been responsible for its 
poor strength) and the smaller demilitarized zone secured by it in negotiations with 
the government provide further evidence to these claims. 

• FARC's signature of attacks and ELN's relative dormancy and geographical isolation 
allows us to classify ambiguous terrorism incidents (on the establishment or infras- 
tructure) as FARC activities with a high degree of confidence. 

The 1998 to 2006 time-period is chosen because of the following reasons. In addition to 
having a clear record of FARC's activity profile over this period, we are aware of two 
clear and well-defined geo-political undercurrents (events) that have had a major impact on 
FARC's activity profile over this period: 

• Event 1: With Colombia becoming the leading cultivator (World Drug Report, 2010, 
Fig. 21) of coca by 1997, the Colombian government under President Andres Pastrana 
and the U.S. government under President William J. "Bill" Clinton instituted a legis- 
lation, popularly known as Plan Colombia, under which, among other tasks, U.S. aid 
for counter-narcotics efforts increased significantly. In the ensuing Presidential elec- 
tions in 2001-02, Alvaro Uribe won the popular mandate on an anti-FARC platform. 
The impact of Plan Colombia and the impending election of President Alvaro Uribe 
are the primary reasons behind a spurt in the activity profile of FARC in the late-2001 
to 2002 time-period. 

• Event 2: With FARC adapting itself to the new equilibrium established by the 
institution of Plan Colombia by 2003-04, we observe a return to "normalcy" in terms 
of the activity profile of FARC. However, with elections to Colombia's Congress and a 
re-election bid by President Uribe in the mid- to late-2005 to 2006 period ensured that 
FARC, which had remained relatively passive in this period, attacked with renewed 
vigor. This is associated with another spurt in the activity profile of FARC for this 
period. See (Cragin and Daly, 2004; Beittel, 2010) (and references therein) for a more 
detailed description of these events and FARC activities. 

B.2. Shining Path. The choice of Shining Path has been motivated by the following 
reasons: 

• The period between 1981 and 1996 corresponds to a full-cycle in terms of Shining 
Path's evolution. Shining Path's evolution toward terrorism began with its inception 
as an ideological pro-leftist outfit in the late-60s followed by capture of the electoral 
body of some university campuses across Peru by the mid-70s. By the early-80s, 
Shining Path had consolidated and re-organized itself as a violent outfit. A cycle of 
strengthening of counter-insurgency operations, loss of resources by Shining Path, re- 
organization of available resources, and consequent violence resulted in the capture 
of its primary ideologue and leader (Abimael Guzman) in 1992. With this key event, 
Shining Path entered a phase of organizational catastrophe and decay. While the 
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organization itself continued to exist in name and through spUntered sub-groups, it 
can be reasonably claimed that the original Shining Path outfit lasted only till the 
mid-90s. 

• As with FARC, Shining Path was the most dominant terrorist group active in Peru in 
the concerned time-period with the other co-existing sub-dominant group being the 
Movimiento Revolucionario Tupac Amaru (MRTA). Shining Path also left a tell-tale 
signature in terms of its targets and attack profile. In fact, 75 of the 163 (or 46.01% 
of all attacks) over this period from RDWTI were attacks on the establishment with 
the rest of the attacks being attacks on soft targets. Thus, any ambiguity with regard 
to incomplete data could be resolved easily. 

The data over this period shows the following rich array of geo-political undercurrents 
resulting in either a spurt or a downfall (at different times) in the activity profile: 

• Rise and stabilization (1981-1986): As Shining Path transformed into a violent 
terrorist group in the early-80s, an initial tepid response from the Government of Peru 
resulted in a period where Shining Path quickly seized the initiative and established 
an equilibrium in terms of activity profile. 

• Resurgence (1987-1990): Attacks on the 1985 Presidential elections resulted in 
massive counter-insurgency operations by the Government of Peru thereby leading to 
a slight lull in activity in 1985-86. This event is followed by re-organization of the 
outfit corresponding to a major resurgence in Shining Path's activity profile in early 
1987. However, this activity profile stabilized subsequently. 

• Heightened state of activity ( 1 991 ) : The third phase is marked by an excessively 
violent 1991. By the end of 1991, Shining Path had control of much of the countryside 
of the center and south of Peru and had a large presence in the outskirts of Lima. 

• Death and decay (1992-1996): As the organization grew in power, a personality 
cult grew around its leader, Abimael Guzman and his philosophy of communism stood 
elevated to the pantheon of Marxist-Leninist-Maoist thought. However, disrespect for 
the indigenous culture of Peru, its economic policies that were inimical to indige- 
nous interests, and the excessive violence and vengeance unleashed by Shining Path 
ensured its massive unpopularity among Peruvians. This institutional apathy in not 
ensuring the support of the local population led to Guzman's capture in September 
1992. Shortly, thereafter, most of the leadership of Shining Path fell as well. With 
the last vestiges of rebellion flaming out by 1993, the death of Shining Path as a 
terrorist outfit is confirmed by a record low number of terrorism incidents in this 
period. See (Cragin and Daly, 2004) for a more detailed description of these events 
and Shining Path activities. 
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